TransAtlasDB: an integrated database connecting expression data, metadata and variants
Abstract
High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment in postulating suitable hypotheses; thus an innovative storage solution that addresses limitations such as hard disk storage requirements, efficiency and reproducibility is paramount. A uniform data storage and retrieval mechanism allows various data to be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides a data model for accurate storage of the large amounts of data derived from RNAseq analysis, and methods of interacting with the database either via command-line data management workflows written in Perl, whose functionalities simplify the storage and manipulation of the massive amounts of data generated from RNAseq analysis, or through the web interface. The database application is currently modeled to handle analysis data from agricultural species and will be expanded to include more species groups. Overall, TransAtlasDB aims to serve as an accessible repository for the large, complex results files derived from RNAseq gene expression profiling and variant analysis. Database URL: https://modupeore.github.io/TransAtlasDB/ © The Author(s) 2018. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. Database, 2018, 1–15. doi: 10.1093/database/bay014
Volume: 2018
Publication date: 2018